Automatic Assessment of Absolute Sentence Complexity

Authors

Sanja Stajner

Simone Paolo Ponzetto

Heiner Stuckenschmidt

Abstract

Lexically and syntactically simpler sentences result in shorter reading time and better understanding in many people. However, no reliable systems for automatic assessment of sentence complexity have been proposed so far. Instead, the assessment is usually done manually, requiring expert human annotators. To address this problem, we first define the sentence complexity assessment as a five-level classification task, and build a ‘gold standard’ dataset. Next, we propose robust systems for sentence complexity assessment, using a novel set of features based on leveraging lexical properties of freely available corpora, and investigate the impact of the feature type and corpus size on the classification performance.
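As a rough illustration of what "features based on lexical properties of freely available corpora" could look like, the sketch below derives a few word-frequency features for a sentence from a reference corpus. It is not the authors' feature set: the function names, the whitespace tokenization, the add-one smoothing, and the choice of features are all assumptions made for this example. A five-level classifier would then be trained on such feature vectors using the gold-standard labels.

```python
import math
from collections import Counter


def build_frequency_table(corpus_sentences):
    """Count lower-cased, whitespace-separated tokens in a reference corpus.

    Hypothetical helper: a real system would use a proper tokenizer.
    """
    return Counter(
        token for sentence in corpus_sentences for token in sentence.lower().split()
    )


def lexical_features(sentence, counts):
    """Return simple corpus-based lexical features for one sentence.

    The features (length, mean/min smoothed log frequency, share of
    out-of-corpus words) are illustrative assumptions, not the paper's set.
    """
    tokens = sentence.lower().split()
    if not tokens:
        return {"length": 0, "mean_log_freq": 0.0, "min_log_freq": 0.0, "oov_ratio": 0.0}

    total = sum(counts.values())
    vocab = len(counts)
    # Add-one smoothing so that unseen words get a finite (low) log frequency.
    log_freqs = [math.log((counts[t] + 1) / (total + vocab + 1)) for t in tokens]

    return {
        "length": len(tokens),
        "mean_log_freq": sum(log_freqs) / len(log_freqs),
        "min_log_freq": min(log_freqs),
        "oov_ratio": sum(1 for t in tokens if counts[t] == 0) / len(tokens),
    }


if __name__ == "__main__":
    reference = ["the cat sat on the mat", "the dog sat"]
    table = build_frequency_table(reference)
    print(lexical_features("the cat perambulated", table))
```

Rarer words lower the mean and minimum log frequency and raise the out-of-corpus ratio, so these values can serve as crude proxies for lexical difficulty; a larger reference corpus would give more stable frequency estimates.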

Similar resources

Syntactic Complexity of Russian Unified State Exam Texts in English: A Study on Reliability and Validity

In this study we analyze texts used in the Russian Unified State Exam (USE) in English. The texts, which form two small research corpora, were retrieved from two sources: the official USE database, as a reference point, and “Neznaika” (https://neznaika.pro/), a popular website used by pupils for USE training. The two corpora are balanced in size: USE has 11934 tokens and “Neznaika” has 11918 tokens. We share Biber’...

Automatic Sentence Ordering Assessment Based on Similarity

One of the tasks of text generation is sentence ordering, since it is crucial for readability. Nevertheless, there is no common approach to evaluating sentence ordering. The state-of-the-art methods are based on comparison with a human-provided order. However, in many cases this is impossible or time- and resource-consuming. Therefore, we propose three completely automatic approaches for sent...

A Study of Speech Quality Indices in Normal 4-5-Year-Old Persian-Speaking Children in the Cities of Semnan, Birjand, and Tonekabon, 1383 (Solar Hijri)

Background and purpose: A person's language abilities can be examined through five parameters of speech quality: speech fluency, speech complexity, speech exactness, speech rate, and lexical accessibility. These parameters are examined through secondary parameters, including mean length of utterance (MLU), mean length of the five longest utterances, mean number of verbs per sentence, mean nu...

Improvement of generative adversarial networks for automatic text-to-image generation

This research concerns the use of deep learning tools and image processing technology for the automatic generation of images from text. Previous studies have used a single sentence to produce images. In this research, a memory-based hierarchical model is presented that uses three different descriptions, given in the form of sentences, to produce and improve the image. The proposed ...

An Improved Automatic EEG Signal Segmentation Method based on Generalized Likelihood Ratio

It is often necessary to label electroencephalogram (EEG) signals by segments of similar characteristics that are particularly meaningful to clinicians and for assessment by neurophysiologists. Within each segment, the signals are considered statistically stationary, usually with similar characteristics such as amplitude and/or frequency. In order to detect the segment boundaries of a signal, we ...

Publication date: 2017